Buying a house requires careful planning. Once you have finalized your budget and chosen the house you want to buy, you must make sure you have sufficient funds to pay the seller.
With rising property prices, most people take out home loans to buy their dream houses. The bank lends only up to 80% of the total amount, depending on the applicant's finances (salary, outgoing expenses, existing loans, etc.). Once the bank tells you how much it can lend, you must pay the rest yourself.
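To make the 80% cap concrete, here is a minimal sketch of the arithmetic. The function name `split_payment`, the parameter `max_loan_ratio`, and the example price are illustrative assumptions, not part of the dataset; the cap is an upper bound, so the amount actually sanctioned may be lower.

```python
def split_payment(property_price, max_loan_ratio=0.80):
    """Return (maximum loan, minimum down payment) under a lending cap.

    `max_loan_ratio` is the assumed upper bound on the share of the price
    the bank will lend; the real sanctioned amount can be smaller.
    """
    max_loan = property_price * max_loan_ratio
    down_payment = property_price - max_loan
    return max_loan, down_payment


# Hypothetical property price, chosen only for illustration.
max_loan, down_payment = split_payment(250_000)
print(f"Maximum loan: {max_loan:,.0f} USD")
print(f"Minimum down payment: {down_payment:,.0f} USD")
```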
The task is to predict the loan amount that can be sanctioned to an applicant based on the available features.
| Columns | Column Description |
|---|---|
| Customer ID | Represents a unique identification number of a customer |
| Name | Represents the name of a customer |
| Gender | Represents the gender of a customer |
| Age | Represents the age of a customer |
| Income (USD) | Represents the income of a customer |
| Income Stability | Represents whether a customer has a stable source of income |
| Profession | Represents the profession of a customer |
| Type of Employment | Represents the type of employment of a customer |
| Location | Represents the location where a customer currently resides |
| Loan Amount Request (USD) | Represents the loan amount requested by a customer |
| Current Loan Expenses (USD) | If a customer has any active loans, this represents the amount that the customer spends on these loans (monthly) |
| Expense Type 1 | Represents a type of expense that a customer spends on (monthly) |
| Expense Type 2 | Represents a type of expense that a customer spends on (monthly) |
| Dependents | Represents whether a customer has any dependents (spouse, parents, siblings, children, etc.) |
| Credit Score | Represents the credit score of a customer |
| No. of Defaults | Represents the number of times a customer has defaulted |
| Has Active Credit Card | Represents if a customer has any active credit cards or not |
| Property ID | Represents an identification number of a property |
| Property Age | Represents the age of a property |
| Property Type | Represents the type of property |
| Property Location | Represents the location of a property |
| Co-Applicant | Represents whether a customer has co-applicants |
| Property Price | Represents the selling price of a property |
| Loan Sanction Amount (USD) | Represents the loan sanctioned amount for a customer |
import pandas as pd
import plotly.express as px
df = pd.read_csv('train.csv', na_values=[-999, '?', ' '])
print(df.shape)
df.head()
df.info()
na_percentage = (df.isna().sum() / df.shape[0]) * 100
na_percentage
df.drop(['Customer ID', 'Name', 'Property ID'], axis=1, inplace=True)
df.describe()
print('Unique Property Type', df['Property Type'].unique())
print('Unique Co-Applicant', df['Co-Applicant'].unique())
df['Property Type'] = df['Property Type'].astype('object')
df['Co-Applicant'] = df['Co-Applicant'].astype('object')
df.describe(include='object')
# Fill missing categories with the most frequent value; assign back instead of chained inplace=True
df['Income Stability'] = df['Income Stability'].fillna(df['Income Stability'].mode()[0])
df['Gender'] = df['Gender'].fillna(df['Gender'].mode()[0])
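The mode imputation above can be checked on a small frame before trusting it on the real data. The sketch below is self-contained; the helper `fill_with_mode` and the toy values are assumptions made up for illustration and are not taken from `train.csv`.

```python
import pandas as pd


def fill_with_mode(frame, columns):
    """Return a copy of `frame` with NaNs in `columns` replaced by each column's mode."""
    filled = frame.copy()
    for col in columns:
        # mode() ignores NaN by default; [0] picks the first (most frequent) value
        filled[col] = filled[col].fillna(filled[col].mode()[0])
    return filled


# Made-up toy rows, used only to exercise the helper.
toy = pd.DataFrame({
    'Gender': ['F', 'M', None, 'F'],
    'Income Stability': ['Low', None, 'Low', 'High'],
})
toy_filled = fill_with_mode(toy, ['Gender', 'Income Stability'])
print(toy_filled.isna().sum())
```

Returning a copy keeps the original frame untouched, which makes it easy to compare the columns before and after imputation.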
fig = px.histogram(df, x='Loan Sanction Amount (USD)', color='Gender', nbins=50)
fig.show()
fig = px.histogram(df, x='Gender', color='Income Stability', barmode='group')
fig.show()
fig = px.histogram(df, x='Gender', color='Profession', barmode='group')
fig.show()
fig = px.histogram(df, x='Type of Employment', color='Gender', barmode='group')
fig.show()
fig = px.histogram(df, x='Has Active Credit Card', color='Gender', barmode='group')
fig.show()
fig = px.histogram(df, x='Property Type', color='Gender', barmode='group')
fig.show()
fig = px.histogram(df, x='Property Location', color='Gender', barmode='group')
fig.show()
fig = px.histogram(df, x='Co-Applicant', color='Gender', barmode='group')
fig.show()
data = pd.DataFrame(df.groupby(['Co-Applicant', 'Gender'])['Loan Sanction Amount (USD)'].mean()).reset_index()
fig = px.bar(data, x='Co-Applicant', y='Loan Sanction Amount (USD)', color='Gender', barmode="group")
fig.show()
no_sanction_df = df[df['Loan Sanction Amount (USD)'] == 0]
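Before inspecting the zero-sanction subset, it can help to know how large it is relative to the full data. A minimal sketch follows, assuming the target column is named as in the table above; the helper `zero_sanction_share` and the toy amounts are hypothetical and only illustrate the calculation.

```python
import pandas as pd

TARGET = 'Loan Sanction Amount (USD)'


def zero_sanction_share(frame, target=TARGET):
    """Fraction of rows whose sanctioned amount is exactly zero.

    Rows with a missing target compare as False, so they count as non-zero.
    """
    return (frame[target] == 0).mean()


# Made-up amounts, used only to exercise the helper.
toy = pd.DataFrame({TARGET: [0.0, 12000.0, 0.0, 45000.0]})
share = zero_sanction_share(toy)
print(f"{share:.1%} of toy applications were sanctioned nothing")
```

On the real data the same call would be `zero_sanction_share(df)`.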
fig = px.histogram(no_sanction_df, x='Gender', color='Profession', barmode='group')
fig.show()
fig = px.histogram(no_sanction_df, x='Type of Employment', color='Gender', barmode='group')
fig.show()
fig = px.histogram(no_sanction_df, x='Co-Applicant', color='Gender', barmode='group')
fig.show()
fig = px.histogram(no_sanction_df, x='Property Age', barmode='group', nbins=10)
fig.show()
fig = px.histogram(no_sanction_df, x='Property Price', barmode='group', nbins=10)
fig.show()
fig = px.histogram(no_sanction_df, x='Dependents', barmode='group', nbins=10)
fig.show()
fig = px.histogram(no_sanction_df, x='Credit Score', barmode='group', nbins=50)
fig.show()
fig = px.histogram(no_sanction_df, x='No. of Defaults', barmode='group')
fig.show()
fig = px.histogram(no_sanction_df, x='Age', barmode='group')
fig.show()
fig = px.histogram(no_sanction_df, x='Current Loan Expenses (USD)', barmode='group')
fig.show()
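# Restrict to numeric columns; pandas >= 2.0 raises an error on object columns otherwise
data = df.corr(numeric_only=True)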
fig = px.imshow(data, color_continuous_midpoint=0, color_continuous_scale='ice')
fig.show()
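A heatmap shows every pairwise correlation at once; when the interest is the target alone, a ranked list can be easier to read. The sketch below is one possible way to do this. The helper `rank_by_target_correlation` and the toy numbers are assumptions for illustration, not results from the dataset.

```python
import pandas as pd


def rank_by_target_correlation(frame, target):
    """Pearson correlation of each numeric column with `target`, strongest first."""
    corr = frame.corr(numeric_only=True)[target].drop(target)
    # Sort by absolute value so strong negative correlations are not buried
    return corr.sort_values(key=abs, ascending=False)


# Made-up toy values, used only to exercise the helper.
toy = pd.DataFrame({
    'Loan Amount Request (USD)': [100, 200, 300, 400],
    'Age': [50, 20, 40, 30],
    'Loan Sanction Amount (USD)': [80, 160, 240, 320],
})
ranked = rank_by_target_correlation(toy, 'Loan Sanction Amount (USD)')
print(ranked)
```

On the real data the call would be `rank_by_target_correlation(df, 'Loan Sanction Amount (USD)')`. Correlation captures only linear association, so a low value does not rule out a useful feature.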